ABSTRACT 

This paper presents results of an approximation study of 
cyclic queueing phenomena that occur in multiprogrammed computer 
systems. Based on Wald's Identity and using ideas of diffusion, 
the objective is to develop convenient and nearly explicit formulas 
relating processor utilization in such systems to simple program 
parameters and the level of multiprogramming. Some numerical results 
to indicate the quality of the proposed approximation are given. 

1. Introduction. 

In a previous paper [2] we have initiated an approximation 
study of cyclic queueing phenomena that occur in multiprogrammed 
computer systems. Particular attention was focused upon processor 
utilization estimation, as the latter depends upon the statistical 
properties of programs. The basis for the approximation was the 
observation that under "heavy traffic" conditions it is plausible to 
approximate the flow of programs in a multiprogrammed computer system 
by means of a diffusion or Wiener process with appropriate infinitesimal 
parameters and boundary conditions. The results were seen to be 
usefully accurate, as judged numerically, and to be of an extremely 
simple analytical form. They can thus be put to use for at least 
preliminary design purposes, with follow-up refined analysis or 
simulation furnishing further corrections if needed. 

Footnote: This author is also a consultant to IBM Research. 

One deficiency of the results of [2] is that they tend to 
misestimate CPU utilization (i.e. the long-run fraction of time that 
the CPU is busy) when CPU service or processing times come from 
distributions of greater positive skewness than the exponential. In 
the present paper we wish to alter our approximation so as to render 
it more accurate in the case of such hyper-exponential-appearing CPU 
service times. This change is important, since currently available 
data indicates that greater-than-exponential skewness is not uncommon. 






2. The Model. 

We suppose, as we did in [2], that J programs are in the 
Central Processing Unit (CPU) - Data Transfer Unit (DTU) cycle. Each 
program is (i) in the process of awaiting, or receiving, service at 
the CPU, at the termination of which (ii) it repairs to the DTU, again 
queueing as if at a single server. Having received the requisite 
information at the DTU stage it then returns to the CPU stage. This 
process continues indefinitely. When programs are completed and thus 
removed from the system new programs are immediately reintroduced. A 
diagram indicating the situation appears below. 

The assumptions made concerning program behavior are the 
following: 

(a) The sequence of CPU service or processing times is one of 
independent identically distributed random variables (i.i.d.r.v.), 
$\{C_i,\ i = 1, 2, \ldots\}$. 

(b) The sequence of DTU service or auxiliary memory access and data 
transfer times is also one of i.i.d.r.v., $\{D_i\}$. 

(c) CPU and DTU processing times are mutually independent. Furthermore, 
we must assume the following. 

(d) The Laplace transform, $E[e^{-sC}]$, of a generic CPU service time 
converges for $-s_A < s < 0$, for some $s_A > 0$. This latter is 
truly a mathematical restriction, but is probably not a serious 
one; all gamma densities, and convex combinations of exponentials 
(hyperexponentials), are covered, for example. 
3. Analysis of the Model. 

In summary, our present approximate analysis of the multiprogramming 
model proceeds by first attempting to find an appropriate 
set of parameters $\mu$ and $\sigma^2$ in the diffusion equation 

$$\frac{\partial F}{\partial t} = -\mu \frac{\partial F}{\partial x} + \frac{\sigma^2}{2} \frac{\partial^2 F}{\partial x^2}. \tag{3.1}$$

Here $F(x,t)$ is the approximate distribution of the number of jobs 
in the CPU stage at time t. An argument for obtaining parameters $\mu$ 
and $\sigma^2$, which was based on asymptotic renewal theory, appears in [2]. 
Then we truncate the stationary distribution to allow for the boundary 
at x = J, and compare several methods for determining a crucial 
constant (denoted by B in [2]) that allows us to deal with the 
boundary at x = 0. The latter is important for it is directly 
related to CPU utilization, which it is our intention to estimate. 

Consider the waiting time, $W_n$, of the n-th customer to arrive 
at an ordinary single server, one in which there is no restriction 
placed upon the number waiting. The latter model would approximate 
the behavior of a cyclic queue or multiprogramming system in which 
the number of programs J is unlimited. We shall assume, as is realistic, 
that the CPU service rate outstrips that of the DTU, i.e. E[C] < E[D]. 

Now Feller ([1], pp. 194-198) shows that $W_n$ has the same distribution as 
the maximum of the partial sums of the unrestricted random walk: 

$$W_n \sim M_n = \max[0, S_1, S_2, \ldots, S_n], \tag{3.2}$$

where 

$$S_n = X_1 + X_2 + \cdots + X_n \tag{3.3}$$

and 

$$X_k = C_k - D_k. \tag{3.4}$$

To study $M_n$, invoke Wald's Identity; see Feller ([1], p. 603) 
or Kingman [4]: 

$$E\left[\frac{e^{sS_N}}{[\psi(s)]^N}\right] = 1, \tag{3.5}$$

N being the random time at which a boundary is reached, and 

$$\psi(s) = E[e^{sX_1}] = E[e^{sC}]\,E[e^{-sD}]. \tag{3.6}$$

Now place a boundary at x > 0, and another at -b, b > 0. Then 

$$E\left[\frac{e^{sS_N}}{[\psi(s)]^N} \,\Big|\, S_N > x\right] P\{S_N > x\} + E\left[\frac{e^{sS_N}}{[\psi(s)]^N} \,\Big|\, S_N < -b\right] P\{S_N < -b\} = 1. \tag{3.7}$$

If E[C] < E[D] it may be shown that the equation 

$$\psi(s) = 1 \tag{3.8}$$

has a solution at s = 0, and one at $\underline{s} > 0$. Put $s = \underline{s}$, let 
$b \to \infty$, and observe that then 

$$P\{S_N > x\} = \frac{1}{E\{e^{\underline{s}S_N} \mid S_N > x\}}. \tag{3.9}$$

This is the probability that the unrestricted random walk $S_n$ ever 
exceeds the boundary at x, and is, by (3.2), equal to the probability 
that the waiting time exceeds x. We write this as 

$$P\{W > x\} = \frac{e^{-\underline{s}x}}{E\{e^{\underline{s}(S_N - x)} \mid S_N > x\}}, \tag{3.10}$$

where $S_N - x > 0$ represents the excess; if we neglect the latter we 
obtain the estimate 

$$P\{W > x\} \approx e^{-\underline{s}x};$$

if x is large we have the approximation 

$$P\{W > x\} \approx C e^{-\underline{s}x}.$$

By the result of Haji and Newell [3], the number, Q, of 
customers in the queue is the number that arrive during the waiting 
time of an arbitrary customer; reference is to the stationary dis- 
tributions of both W and Q. Conditionally, 

$$P\{Q \ge n \mid W = x\} = G^{n*}(x), \tag{3.11}$$

where G is the distribution function of D, and * represents 
Stieltjes convolution. Then, by (3.11) above, 

$$P\{Q \ge n\} \approx \int_0^\infty G^{n*}(x)\, C\, \underline{s}\, e^{-\underline{s}x}\, dx = C\,[\hat{G}(\underline{s})]^n, \tag{3.12}$$

where $\hat{G}(\underline{s})$ is the Laplace-Stieltjes transform of G, evaluated at 
$\underline{s}$. This effectively states that, at least under heavy traffic 
conditions ($\rho = E[C]/E[D]$ barely < 1), the stationary distribution of the number 
in the system is exponential, but with parameter somewhat different 
from that of the diffusion approximation: 

Diffusion: 

$$P\{Q \ge x\} \approx e^{(2\mu/\sigma^2)x}, \tag{3.13a}$$

where 

$$\mu = \frac{1}{E[D]} - \frac{1}{E[C]}, \qquad \sigma^2 = \frac{\mathrm{Var}[D]}{(E[D])^3} + \frac{\mathrm{Var}[C]}{(E[C])^3};$$

Wald: 

$$P\{Q \ge x\} \approx e^{[\ln \hat{G}(\underline{s})]x} = [\hat{G}(\underline{s})]^x; \tag{3.13b}$$

see Gaver and Shedler [2]. For a new approximation we then merely 
replace the ratio $2\mu/\sigma^2$ by $\ln \hat{G}(\underline{s})$ and fit constants as was done in 
[2]. The relation between the parameters in the diffusion approximation 
expressed by (3.13a) and that in the approximation resulting 
from Wald's Identity (3.13b) is considered in the Appendix. 

Given the values of $\mu$ and $\sigma^2$, the stationary diffusion 
approximation for the distribution F of Q satisfies 

$$0 = -\frac{dF}{dx} + \frac{\sigma^2}{2\mu}\frac{d^2F}{dx^2}, \tag{3.14}$$

in which we now propose to replace $2\mu/\sigma^2$ by $\ln \hat{G}(\underline{s})$. We also must 
determine the constant B in the solution to (3.14), 

$$F(x;J) = \frac{1 - Be^{-\alpha x}}{1 - Be^{-\alpha J}}, \qquad 0 \le x \le J;\ \alpha > 0, \tag{3.15}$$

where the expression satisfies the boundary condition at 
x = J: F(J;J) = 1. Here we have introduced the notation F(x;J) 
to emphasize the dependence upon the parameter J. The constant 
$\alpha > 0$ can be determined either by an argument based on asymptotic 
normality in conflicting renewal processes (see [2]), or as we have 
argued, using Wald-Haji-Newell results. 

We now present two ways in which B can be determined. 

4. Fitting the Constant B: Approximations for CPU Utilization 

We suggest and investigate two ways in which the constant B in 
(3.15) can readily be determined. 

Method 1. If $J = \infty$ then it is well known (see Takacs [6], p. 142) that 
server (CPU) utilization is 

$$\rho = \frac{E[C]}{E[D]}. \tag{4.1}$$

Hence it follows that to achieve this approximately for large J 
we should put $B = B_1 = \rho$, from which it follows that 

$$F(x;J) = \frac{1 - \rho e^{-\alpha x}}{1 - \rho e^{-\alpha J}}, \qquad 0 \le x \le J;\ \alpha > 0. \tag{4.2}$$

This approach was taken in [2] with good results for exponential CPU 
service. 

Method 2. With probability F(J-1;J) there is at least one program 
in residence at the DTU. Hence the long-run input to the CPU should 
be 1 · F(J-1;J), assuming that E[D] = 1. Now the long-run output 
rate from the CPU must equal the input rate, and the output rate 
is $[1 - F(0+;J)]/E[C]$; then, with $\rho = E[C]$, 

$$\rho\,F(J-1;J) = 1 - F(0+;J),$$

from which we find that 

$$B_2 = \frac{\rho}{1 + \rho e^{-\alpha(J-1)} - e^{-\alpha J}}.$$

Of course as $J \to \infty$, $B_2 \to \rho = B_1$. 

We shall shortly provide some numerical comparisons that 
illustrate the behavior of the two methods when they are applied to 
actual measurement data. 
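
The two fitting rules can be compared numerically. The Python sketch below assumes the truncated exponential form F(x;J) = (1 - B e^{-αx}) / (1 - B e^{-αJ}) implied by (4.2), takes CPU utilization to be 1 - F(0+;J), and assumes E[D] = 1 so that ρ = E[C]. The function names are illustrative and do not come from the paper.

```python
import math


def b_method1(rho):
    """Method 1: B_1 = rho, the utilization E[C]/E[D] of the J = infinity system."""
    return rho


def b_method2(rho, alpha, J):
    """Method 2: B_2 obtained by equating CPU input and output rates (E[D] = 1)."""
    return rho / (1.0 + rho * math.exp(-alpha * (J - 1)) - math.exp(-alpha * J))


def cpu_utilization(B, alpha, J):
    """Approximate utilization 1 - F(0+; J) for the assumed form of F(x; J)."""
    return 1.0 - (1.0 - B) / (1.0 - B * math.exp(-alpha * J))


if __name__ == "__main__":
    rho = 0.5
    alpha = -math.log(rho)  # exponential/exponential case, where G(s) = rho
    for J in (2, 3, 5, 10):
        u1 = cpu_utilization(b_method1(rho), alpha, J)
        u2 = cpu_utilization(b_method2(rho, alpha, J), alpha, J)
        print(J, round(u1, 4), round(u2, 4))
```

Here α is an input; in the Wald-based approximation it would be taken as -ln Ĝ(s̲), and in the diffusion approximation as -2μ/σ².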
5. Special Cases. 

We now describe the manner in which our approximations may 
be applied when certain specific distributions are in force. 

Case 1: CPU service exponential, $E[C] = \lambda^{-1}$; DTU service Erlang-k, 
E[D] = 1. 

In this case equation (3.8) has the form 

$$\psi(s) = \left(\frac{\lambda}{\lambda - s}\right)\left(\frac{1}{1 + s/k}\right)^k = 1. \tag{5.1}$$

It must be solved numerically for $\underline{s}$, a task that can be carried out 
by Newton-Raphson iteration. 
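
A minimal sketch of the Newton-Raphson step for Case 1 follows. It assumes the root equation ψ(s) = [λ/(λ - s)] (1 + s/k)^{-k} = 1, which is what (3.6) and (3.8) give for exponential CPU service with mean 1/λ and Erlang-k DTU service with mean 1. Iterating on ln ψ(s), which is convex, and starting to the right of the positive root is a choice made here so that the iteration cannot slide to the trivial root at s = 0. The function names are hypothetical.

```python
import math


def log_psi_case1(s, lam, k):
    """ln psi(s) for exponential CPU (rate lam) and Erlang-k DTU (mean 1)."""
    return math.log(lam / (lam - s)) - k * math.log1p(s / k)


def dlog_psi_case1(s, lam, k):
    """Derivative of ln psi(s) with respect to s."""
    return 1.0 / (lam - s) - 1.0 / (1.0 + s / k)


def wald_root_case1(lam, k, tol=1e-12, max_iter=200):
    """Positive root of psi(s) = 1 on (0, lam); requires E[C] = 1/lam < 1."""
    if lam <= 1.0:
        raise ValueError("need E[C] = 1/lam < E[D] = 1")
    # Move toward the pole at s = lam until ln psi is positive, i.e. until
    # the starting point lies to the right of the positive root.
    s = 0.5 * lam
    for _ in range(60):
        if log_psi_case1(s, lam, k) > 0.0:
            break
        s = 0.5 * (s + lam)
    else:
        raise RuntimeError("could not bracket the positive root")
    # Newton-Raphson on the convex function ln psi, descending from the right.
    for _ in range(max_iter):
        step = log_psi_case1(s, lam, k) / dlog_psi_case1(s, lam, k)
        s -= step
        if abs(step) < tol:
            return s
    raise RuntimeError("Newton-Raphson did not converge")
```

With k = 1 the DTU service is exponential and the root can be checked against the closed form λ - 1.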



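Case 2: CPU services exponential; DTU service constant, E[D] = 1. 

For this limiting case of (5.1) let $k \to \infty$ to obtain the 
equation 

$$\left(\frac{\lambda}{\lambda - s}\right) e^{-s} = 1. \tag{5.2}$$

Case 3: CPU services Erlang-k, $E[C] = \lambda^{-1}$; DTU service constant, 
E[D] = 1. 

Here we must solve 

$$\left(\frac{1}{1 - s/(k\lambda)}\right)^k e^{-s} = 1. \tag{5.3}$$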
Case 4: CPU services hyperexponential; DTU services constant, 
E[D] = 1. 

Representation of CPU services by means of a convex combination 
of exponentials (the hyperexponential distribution) suggests 
itself according to actual program trace data. This model leads to 
the equation 

$$\left[\frac{p\lambda_1}{\lambda_1 - s} + \frac{(1-p)\lambda_2}{\lambda_2 - s}\right] e^{-s} = 1, \tag{5.4}$$

where 

$$E[C] = \lambda^{-1} = p\lambda_1^{-1} + (1-p)\lambda_2^{-1}$$

and p takes on an appropriate value between zero and unity. In 
practice it is convenient (if not statistically efficient) to fit the 
parameters of Cases 3 and 4 by the matching of low moments from model 
and data. Supposing that $\lambda_1^{-1} < E[C] < \lambda_2^{-1}$, it can be shown that, 
given E[C] and Var[C] such that $(\mathrm{Var}[C])^{1/2}/E[C] > 1$, along with 
$\lambda_2$, p and $\lambda_1$ are uniquely determined. 
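
The moment-matching step for the two-branch hyperexponential can be sketched as follows. The sketch assumes that one branch mean (called `inv_lam2` here, a hypothetical name) is fixed in advance and that the remaining two parameters are chosen to match E[C] and Var[C]; it uses only the standard moment formulas E[C] = p/λ₁ + (1-p)/λ₂ and E[C²] = 2p/λ₁² + 2(1-p)/λ₂². It is an illustration of the idea, not the authors' fitting procedure.

```python
def fit_hyperexponential(mean_c, var_c, inv_lam2):
    """Return (p, lam1, lam2) matching E[C] and Var[C], with 1/lam2 given.

    Writing u = 1/lam1 and a = 1/lam2, the two moment equations are
        p*u    + (1 - p)*a    = E[C]
        p*u**2 + (1 - p)*a**2 = E[C^2] / 2
    Subtracting a and a**2 respectively and dividing gives u + a directly.
    """
    a = inv_lam2
    if a <= 0.0 or a == mean_c:
        raise ValueError("1/lam2 must be positive and differ from E[C]")
    half_m2 = 0.5 * (var_c + mean_c ** 2)
    u = (half_m2 - a * a) / (mean_c - a) - a
    if u <= 0.0 or u == a:
        raise ValueError("no valid hyperexponential fit for these inputs")
    p = (mean_c - a) / (u - a)
    if not 0.0 < p < 1.0:
        raise ValueError("mixing probability falls outside (0, 1)")
    return p, 1.0 / u, 1.0 / a
```

A quick consistency check is to generate the moments from known parameters and confirm that the fit returns them.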

Unfortunately, all of the above models require the numerical 
solution of a transcendental equation in order to generate actual 
numerical estimates of CPU utilization. This disadvantage is not 
possessed by the diffusion approximation of [2]. 

It is of interest that our procedure gives results in complete 
accord with an exact analysis in one particular case. 
Case 5: CPU and DTU services exponential. 

This case can easily be analyzed by simple birth-and-death 
process methods, for which see [2]. Our procedure demands that we 
first solve 

$$\left(\frac{\lambda}{\lambda - s}\right)\left(\frac{1}{1 + s}\right) = 1, \tag{5.5}$$

which in this case has the explicit solution $\underline{s} = \lambda - 1$; consequently 
$\hat{G}(\underline{s}) = 1/\lambda = \rho$. Then the approximation yields 

$$F(0+;J) = \frac{1 - B_i}{1 - B_i \rho^J} \qquad (i = 1, 2).$$

Here $B_i$ refers to the constant B as determined by Method i (i = 1 or 2). 
But for the present model we have 

$$B_2 = \frac{\rho}{1 + \rho[\rho^{J-1}] - \rho^J} = \rho = B_1, \tag{5.6}$$

and use of $B = \rho$ yields 

$$F(0+;J) = \frac{1 - \rho}{1 - \rho^{J+1}}, \tag{5.7}$$

so our approximation is in this case equal to the birth-and-death result. 

For our other cases exact equality will not hold. 
6. Numerical Results. 

We now present numerical results to indicate the quality of 
the proposed approximation. Our examples are in the context of a 
single processor system with two-level memory, multiprogrammed and 
operated in a demand paging environment. A discussion of cyclic 
queueing phenomena in such systems is given in Lewis and Shedler [5]. 
Accordingly, we interpret the CPU service times in our model as 
execution intervals, i.e. times between page exceptions as programs 
execute in (constrained) memory of given capacity. We concentrate 
on Case 4 above (CPU services hyperexponential, DTU services constant) 
on the basis of our experience that execution intervals often fit 
well to a hyperexponential model. The assumption of constant DTU 
service times arises from the consideration of average access time 
along with the time to transfer a page of information. 

In all cases we shall consider, values for p, $\lambda_1$ and $\lambda_2$ 
in the hyperexponential were obtained by matching first and second 
moments of the empirical distribution obtained from actual program 
data. 

Tables 1 and 2 contain numerical results for CPU utilization 
obtained by the approximation technique (for both methods of fitting 
the constant B) along with results of exact analysis based on semi- 
Markov (S-M) methods as given in [5]. 
Table 1: CPU Utilization Comparisons 

                   Approx    Approx 
   J      S-M       B_1       B_2 
   2     .3903     .1909     .3972 
   3     .4054     .2486     .4274 
   4     .4178     .2924     .4440 
   5     .4280     .3264     .4545 
   6     .4367     .3534     .4616 
   7     .4439     .3751     .4668 
   8     .4501     .3927     .4708 
   9     .4553     .4072     .4736 
  10     .4598     .4193     .4759 

E[C] = 4871, Var[C] = .26492 × 10^[exponent illegible], E[D] = 10,000 

                   Approx    Approx 
   J      S-M       B_1       B_2 
   2     .2216     .1455     .2216 
   3     .2286     .1789     .2313 
   4     .2333     .2003     .2361 
   5     .2366     .2144     .2388 
   6     .2388     .2238     .2404 
   7     .2403     .2301     .2415 
   8     .2413     .2344     .2422 
   9     .2420     .2373     .2426 
  10     .2425     .2393     .2429 

E[C] = 4871, Var[C] = .26492 × 10^[exponent illegible], E[D] = 20,000 
Table 2: CPU Utilization Comparisons 

                   Approx    Approx 
   J      S-M       B_1       B_2 
   2     .4076     .0770     .4249 
   3     .4281     .1094     .4579 
   4     .4449     .1385     .4764 
   5     .4587     .1649     .4882 
   6     .4702     .1887     .4964 
   7     .4798     .2105     .5024 
   8     .4879     .2304     .5070 
   9     .4948     .2485     .5106 
  10     .5006     .2654     .5136 

E[C] = 10,735, Var[C] = .12313 × 10^10, E[D] = 20,000 

                   Approx    Approx 
   J      S-M       B_1       B_2 
   2     .5316     .2148     .5993 
   3     .5548     .2884     .6667 
   4     .5752     .3481     .7064 
   5     .5935     .3974     .7326 
   6     .6098     .4388     .7511 
   7     .6245     .4741     .7650 
   8     .6379     .5045     .7757 
   9     .6500     .5309     .7842 
  10     .6611     .5542     .7911 

E[C] = 17,026, Var[C] = .39780 × 10^10, E[D] = 20,000 





Finally, we present some results of CPU utilization obtained 
by trace-driven simulation of the cyclic queueing system. By this 
we mean that CPU service times in the model were taken to be the 
actual sequence of execution intervals derived from a program trace, 
J copies of this sequence being multiprogrammed. In Table 3, these 
trace-driven results are displayed, along with values of CPU utilization 
obtained by the approximation technique. 
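
For readers who wish to experiment, the cyclic CPU-DTU queue can be simulated in a few lines. The sketch below is not the trace-driven simulator used for Table 3: it draws service times from user-supplied samplers (hypothetical callables `draw_c` and `draw_d`) instead of replaying a program trace, and it assumes single-server first-come-first-served service at both stages with J programs circulating.

```python
import math
import random


def simulate_cyclic_queue(J, draw_c, draw_d, n_events=200_000, seed=1):
    """Estimate CPU utilization for J programs cycling between CPU and DTU.

    draw_c(rng) and draw_d(rng) return one CPU or DTU service time.
    """
    rng = random.Random(seed)
    n = J                   # programs at the CPU stage (waiting or in service)
    t = 0.0                 # simulation clock
    busy = 0.0              # accumulated CPU busy time
    t_c = draw_c(rng)       # next CPU completion epoch
    t_d = math.inf          # next DTU completion epoch (DTU starts empty)
    for _ in range(n_events):
        t_next = min(t_c, t_d)
        if n > 0:
            busy += t_next - t
        t = t_next
        if t_c <= t_d:
            # CPU completion: one program moves to the DTU.
            n -= 1
            t_c = t + draw_c(rng) if n > 0 else math.inf
            if n == J - 1:  # the DTU was idle, so it starts service now
                t_d = t + draw_d(rng)
        else:
            # DTU completion: one program returns to the CPU.
            n += 1
            t_d = t + draw_d(rng) if n < J else math.inf
            if n == 1:      # the CPU was idle, so it starts service now
                t_c = t + draw_c(rng)
    return busy / t


if __name__ == "__main__":
    # Exponential CPU (mean 0.5) and exponential DTU (mean 1), J = 3.
    u = simulate_cyclic_queue(
        3,
        lambda rng: rng.expovariate(2.0),
        lambda rng: rng.expovariate(1.0),
    )
    print(round(u, 3))
```

With exponential service at both stages the estimate can be checked against the birth-and-death value 1 - (1 - ρ)/(1 - ρ^{J+1}); replacing `draw_c` by a function that steps through a recorded list of execution intervals would give a trace-driven variant.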

Table 3: Trace-Driven Simulation 
CPU Utilization Comparisons 

                   Approx    Approx 
   J     Trace      B_1       B_2 
   3     .227      .1789     .2313 
   6     .229      .2238     .2404 

E[C] = 4871, Var[C] = .26492 × 10^[exponent illegible], E[D] = 20,000 

                   Approx    Approx 
   J     Trace      B_1       B_2 
   3     .419      .1094     .4579 
   6     .425      .1887     .4964 

E[C] = 10,735, Var[C] = .12313 × 10^10, E[D] = 20,000 

                   Approx    Approx 
   J     Trace      B_1       B_2 
   3     .538      .2884     .6667 
   6     .546      .4388     .7511 

E[C] = 17,026, Var[C] = .39780 × 10^10, E[D] = 20,000 
7. Summary and Conclusions. 

This paper presents the results of approximating processor 
utilization in multiprogrammed computer systems using ideas of 
diffusion. In particular, the objective is to develop convenient and 
nearly explicit formulas relating CPU utilization to simple program 
parameters and to the level of multiprogramming. 

Numerical comparisons indicate that a reasonably effective 
approximation has been obtained when the constant $B_2$ is utilized. 

Examples show that for the actual program traces studied our present 
approximation is superior to that of [2], which assumed exponentially 
distributed CPU service times. Data from our trace material is far 
more skewed (long-tailed) than that yielded by the exponential. 
Research continues in an attempt to improve the approximate procedures 
obtained to date. A promising approach is the iteration of our 
approximate solutions. Of course, an eventual goal is that of 
obtaining simple but adequate approximations to properties of some- 
what more complex and truly realistic networks of servers. 
Appendix 

The relation between the parameter in the diffusion approximation 
of [2], as expressed in (3.13a), and that in the approximation resulting 
from Wald's Identity, (3.13b), will now be considered. Application 
of Wald's Identity requires that we find the positive root, $\underline{s}$, of 
(3.8). Let us expand $\psi(s)$ in Taylor's series: 

$$\psi(s) = 1 + s\mu_X + \frac{s^2}{2}\sigma_X^2 + \cdots, \tag{A-1}$$

where the remainder is $o(s^2)$, provided that required moments exist. 
Here 

$$\mu_X = E[X] = E[C - D] < 0, \qquad \sigma_X^2 = \mathrm{Var}[X] = \mathrm{Var}[C] + \mathrm{Var}[D]. \tag{A-2}$$

At $\underline{s}$ we have from (A-1) and (3.8), after dispensing with the root 
at s = 0, 

$$\mu_X + \frac{\underline{s}}{2}\sigma_X^2 + r(\underline{s}) = 0, \tag{A-3}$$

or 

$$\frac{2\mu_X}{\underline{s}\,\sigma_X^2} + 1 + \frac{2}{\underline{s}\,\sigma_X^2}\, r(\underline{s}) = 0. \tag{A-4}$$

Therefore, if we consider a sequence of queueing situations in which 
$\mu_X \to 0$ and $\sigma_X^2$ does not approach zero, the remainder term approaches 
zero, since $r(s) = o(s)$. We see then that as $\mu_X \to 0$, 

$$\frac{2\mu_X}{\underline{s}\,\sigma_X^2} \to -1, \tag{A-5}$$

or 

$$\underline{s} \sim -\frac{2\mu_X}{\sigma_X^2}. \tag{A-6}$$

In the event that $\underline{s} \to 0$ our Wald approximation and the 
approximation of [2] coincide, as will now be shown. For s approaching 
zero, as will be true in heavy traffic, 

$$-\ln \hat{G}(\underline{s}) = \underline{s}\,E[D] + o(\underline{s}). \tag{A-7}$$

Consequently the parameter in the Wald-Haji-Newell approximation 
becomes in heavy traffic 

$$-\underline{s}\,E[D] \approx \frac{2\mu_X}{\sigma_X^2}E[D] = -\frac{2(E[D]-E[C])\,E[D]}{\mathrm{Var}[D]+\mathrm{Var}[C]} = \frac{2\left(\dfrac{1}{E[D]} - \dfrac{1}{E[C]}\right)}{\dfrac{\mathrm{Var}[D]}{(E[D])^2E[C]} + \dfrac{\mathrm{Var}[C]}{(E[D])^2E[C]}} \approx \frac{2\left(\dfrac{1}{E[D]} - \dfrac{1}{E[C]}\right)}{\dfrac{\mathrm{Var}[D]}{(E[D])^3} + \dfrac{\mathrm{Var}[C]}{(E[C])^3}} = \frac{2\mu}{\sigma^2}. \tag{A-8}$$



For the specific models introduced earlier in Section 5 it is 
clearly sufficient to allow the mean CPU service time to approach unity 
from below in order to force $\underline{s}$ to zero. Consider, for example, 
Case 3: letting $E[C] = \lambda^{-1}$ increase it is apparent that for every 
fixed s, $\psi(s)$, the left-hand side of (5.3), increases, and $\underline{s}$ 
moves continuously towards the origin; when $\lambda = 1$ there is a (double) 
root at s = 0. A similar effect occurs when, say, $\lambda_1$ decreases in (5.4), 
a maneuver that allows E[C] to approach unity. Again $\psi(s)$ is 
increased for every s, and in the limit there is a double root at 
s = 0. Recall that the region of convergence of the transform $\psi(s)$ 
is $s < \min(\lambda_1, \lambda_2) = s_A$, and since $\underline{s} < s_A$ a decrease in either $\lambda_1$ 
or $\lambda_2$ eventually sends $\underline{s}$ to zero. Examination of the denominator 
of (3.10) suggests also that if $\underline{s}$ is near zero the expectation is 
near unity, thus further justifying the use of our approximation. 
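
The heavy-traffic agreement of the two parameters can be illustrated in the one case where the Wald root is explicit. The sketch below assumes exponential CPU service with mean 1/λ and exponential DTU service with mean 1 (Case 5), for which the root is λ - 1 and Ĝ(s) = 1/(1 + s); it compares -ln Ĝ(s̲) with the diffusion value -2μ/σ² computed from the moment formulas given with (3.13a). The function names are illustrative.

```python
import math


def wald_parameter_case5(lam):
    """-ln G(s) evaluated at the Wald root s = lam - 1, where G(s) = 1/(1 + s)."""
    s_root = lam - 1.0
    return math.log1p(s_root)


def diffusion_parameter_case5(lam):
    """-2*mu/sigma^2 with exponential CPU (mean 1/lam) and exponential DTU (mean 1)."""
    mean_c, var_c = 1.0 / lam, 1.0 / lam ** 2
    mean_d, var_d = 1.0, 1.0
    mu = 1.0 / mean_d - 1.0 / mean_c
    sigma2 = var_d / mean_d ** 3 + var_c / mean_c ** 3
    return -2.0 * mu / sigma2


if __name__ == "__main__":
    # As lam decreases to 1 (E[C] increases to E[D]) the two values merge.
    for lam in (2.0, 1.5, 1.1, 1.01):
        w = wald_parameter_case5(lam)
        d = diffusion_parameter_case5(lam)
        print(lam, w, d, w / d)
```

In this special case the comparison reduces to ln λ against 2(λ - 1)/(λ + 1), which agree to leading order as λ approaches 1.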